In [1]:
%autosave 10


Autosaving every 10 seconds

To be clear

  • Not making programs faster.
  • Not making websites smaller.
  • scipy.optimize: mathematical optimisation, minimising functions, etc.

 What is it

  • You have an objective function f(x), possibly of several variables.
  • Given some constraints, find the value(s) of x that make f as large or as small as possible.

Applications

  • You have data, and a numeric way of saying how good a candidate solution is.

 Why Python?

  • End-to-end nature is great.
  • Exploration/viz: IPython, matplotlib
  • Batteries included: scipy.optimize, statsmodels.
  • Cython, C bindings, glue to other high performance methods.

No structure

If the data is completely random there is no structure to exploit, so brute force (evaluating f everywhere and taking np.argmin) is the best you can do.
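A minimal sketch of the brute-force approach (the objective here is illustrative): evaluate f on a dense grid and keep the best point.

```python
import numpy as np

# Brute force: evaluate f on a dense grid, keep the point with the smallest value.
f = lambda x: (x - 0.7)**2  # illustrative objective
xs = np.linspace(-10, 10, 100001)
best_x = xs[np.argmin(f(xs))]
print(best_x)  # close to 0.7
```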

Completely structured

Completely linear, or quadratic, functions are easy to solve without numeric methods, let alone brute force, because the structure lets us solve them analytically.

Real problems

Tend to be semi-structured, and often can't be expressed symbolically; there may not even be a symbolic representation.

scipy.optimize

import numpy as np
import scipy.optimize

f = lambda x: np.exp((x - 4)**2)
scipy.optimize.minimize(f, 5)

But this is extremely customisable: there are lots of methods, each with strengths and weaknesses. The default is BFGS; to use another, just change the 'method' keyword of scipy.optimize.minimize.
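For example, switching solvers is just that one keyword (a sketch, using the same illustrative objective as above):

```python
import numpy as np
import scipy.optimize

# Same objective as above; switching solvers is just a keyword change.
f = lambda x: np.exp((x - 4)**2)
res = scipy.optimize.minimize(f, 5, method='Nelder-Mead')
print(res.x)  # near 4
```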

How to choose solver?

  • If you have a small problem, choose anything (but check the answer!). Why waste time?
  • Big problem, or top speed needed: you need to understand your problem. Is it convex? Is it linear? Is there a specialised algorithm with smaller big-O complexity?
  • Actual big data: none of this will work.
    • Try online methods like SGD.
  • Don't understand your problem at all?
    • Genetic algorithms.
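scipy itself ships one evolutionary method, differential_evolution (a close cousin of genetic algorithms): derivative-free, population-based, and it only needs bounds. A minimal sketch on an illustrative function:

```python
import scipy.optimize

# Differential evolution: derivative-free, population-based global search.
def f(x):
    return (x[0] - 1)**2 + (x[1] + 2)**2

res = scipy.optimize.differential_evolution(f, bounds=[(-5, 5), (-5, 5)])
print(res.x)  # near (1, -2)
```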

Characteristics of each method

  • Needs derivatives?
  • Needs constraints?

Check error messages! The solver might tell you it isn't sure about the solution.

How do problems vary?

  • Size: few to many dimensions
  • Smoothness: smooth, or e.g. unit step discontinuities
  • Global shape: many local minima? strictly convex?

 Nelder-Mead

  • No derivatives needed
  • No constraints needed
  • Few assumptions.

 Simulated annealing

  • No derivatives needed, no constraints needed, few assumptions.
  • Expects multiple minima.
  • Jumps all over, inefficiently, but tries hard to find the global minimum.
  • Varies the temperature: as the search continues it jumps around less and less, focusing on minima in a smaller and smaller neighbourhood.
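In recent scipy versions the annealing-style solver is scipy.optimize.dual_annealing (the original scipy.optimize.anneal was removed); it behaves in the same spirit. A sketch on an illustrative multimodal function:

```python
import numpy as np
import scipy.optimize

# Multimodal function: many local minima, one global minimum at x = 0.
def f(x):
    return x[0]**2 + 10 * np.sin(x[0])**2

res = scipy.optimize.dual_annealing(f, bounds=[(-10, 10)], seed=0)
print(res.x)  # near 0, the global minimum
```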

Newton's Method

  • Needs derivative.
  • Uses gradient at current point to jump to new point.
  • Fast.
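A hand-rolled sketch of the idea for 1-D minimisation: repeatedly jump to where the local quadratic model predicts the minimum, x ← x − f′(x)/f″(x). The objective here is illustrative.

```python
# Newton's method for minimisation: jump using local gradient and curvature.
# Illustrative objective: f(x) = (x - 3)**4, minimum at x = 3.
fprime = lambda x: 4 * (x - 3)**3    # first derivative
fsecond = lambda x: 12 * (x - 3)**2  # second derivative

x = 10.0
for _ in range(50):
    x = x - fprime(x) / fsecond(x)   # Newton update
print(x)  # converges towards 3
```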

 Linear programs and PuLP


In [2]:
from pulp import *

x = LpVariable('x', -10, 10)
y = LpVariable('y', -10, 10)
prob = LpProblem("Toy problem", LpMinimize)
prob += 3*x - y
prob.solve()


Out[2]:
1

In [3]:
prob


Out[3]:
Toy problem:
MINIMIZE
3*x + -1*y + 0
VARIABLES
-10 <= x <= 10 Continuous
-10 <= y <= 10 Continuous

In [4]:
(x.value(), y.value())


Out[4]:
(-10.0, 10.0)
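The same toy LP can be cross-checked without PuLP using scipy.optimize.linprog, which minimises c·x subject to bounds (a sketch):

```python
import scipy.optimize

# Minimise 3x - y with -10 <= x, y <= 10, as in the PuLP example above.
res = scipy.optimize.linprog(c=[3, -1], bounds=[(-10, 10), (-10, 10)])
print(res.x)  # (-10, 10), matching PuLP
```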

Sudoku example from the PuLP docs

  • Binary variable for each possible number in each box.
  • Then solve within constraints.

 cvxpy

  • For problems which aren't linear.
  • Backed by CVXOPT
  • Example code for the LASSO ($L_1$-penalised least squares) problem: minimise the sum of squared errors while also penalising the sum of the absolute values of the coefficients.
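Even without cvxpy to hand, the LASSO objective is easy to write down directly. A sketch with numpy plus a derivative-free scipy solver (the data, the penalty weight lam, and the helper name are all illustrative, not from the cvxpy docs):

```python
import numpy as np
import scipy.optimize

# Illustrative synthetic data: 50 samples, 3 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
beta_true = np.array([2.0, 0.0, -1.0])
y = X @ beta_true + 0.01 * rng.normal(size=50)

lam = 0.1  # illustrative penalty weight

def lasso_objective(beta):
    # sum of squared errors plus the L1 penalty on the coefficients
    return np.sum((y - X @ beta)**2) + lam * np.sum(np.abs(beta))

# Nelder-Mead copes with the non-smooth |beta| term without derivatives.
res = scipy.optimize.minimize(lasso_objective, np.zeros(3), method='Nelder-Mead')
print(res.x)  # near (2, 0, -1)
```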

Feynman solution - just solve it

  • Analytic solution after thinking hard.
  • Minimising $x^2$ seems to have a connection to the derivative.
  • So find the point with gradient = 0.
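The "just solve it" route in sympy, on an illustrative quadratic: differentiate, set to zero, solve.

```python
import sympy as S

# The Feynman route: the minimum is where the derivative vanishes.
x = S.symbols('x', real=True)
f = (x - 4)**2
stationary = S.solve(S.diff(f, x), x)
print(stationary)  # [4]
```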

 But what if getting derivative is impossible?

  • numdifftools: numerical approximation; scipy.optimize also has methods that approximate derivatives internally.
  • Automatic differentiation: not really available in Python; see Julia, Stan, ADMB.
    • Tracks the elementary operations as they execute to compute exact derivatives, rather than numerical approximations.
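scipy's own finite-difference helper, scipy.optimize.approx_fprime, sketches the numerical route (the objective is illustrative):

```python
import numpy as np
import scipy.optimize

# Forward-difference gradient of f(x) = sum(x**2); the exact gradient is 2*x.
f = lambda x: np.sum(x**2)
x0 = np.array([1.0, -3.0])
grad = scipy.optimize.approx_fprime(x0, f, 1e-8)
print(grad)  # approximately [2, -6]
```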

In [7]:
import sympy as S
from math import *
S.var("x mu sigma", real=True)


Out[7]:
(x, mu, sigma)

In [8]:
f = 1/(sqrt(2*S.pi)*sigma)*S.exp(-(x-mu)**2/(2*sigma**2))
f


Out[8]:
0.398942280401433*exp(-(-mu + x)**2/(2*sigma**2))/sigma

In [9]:
ll = log(f)


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-9-569f5e1c5974> in <module>()
----> 1 ll = log(f)

/Users/ai/Programming/.envs/default/lib/python2.7/site-packages/sympy/core/expr.pyc in __float__(self)
    205         if result.is_number and result.as_real_imag()[1]:
    206             raise TypeError("can't convert complex to float")
--> 207         raise TypeError("can't convert expression to float")
    208 
    209     def __complex__(self):

TypeError: can't convert expression to float
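The error above is because `from math import *` shadowed `log` with math.log, which only accepts floats. Using sympy's own log (and S.sqrt, so the constant stays exact) keeps everything symbolic; a sketch:

```python
import sympy as S

# math.log chokes on symbolic input; S.log keeps the expression symbolic.
x, mu = S.symbols("x mu", real=True)
sigma = S.symbols("sigma", positive=True)
f = S.exp(-(x - mu)**2 / (2 * sigma**2)) / (S.sqrt(2 * S.pi) * sigma)
ll = S.log(f)  # symbolic log-likelihood, no TypeError
# Its derivative w.r.t. mu simplifies to (x - mu)/sigma**2.
print(S.simplify(S.diff(ll, mu)))
```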

Questions

  • How to investigate a big problem?
    • Plug it into cvxpy and ask whether it's convex.
    • If it's just a black box of data, plot it, but that only works in low dimensions.
    • There are convex optimisation methods.
